Abstract

In  this  note,  we  provide  a  new  proof  for  the  nonoverlap  property  of  the  Thue- 
Morse  sequence  using  a  Boolean  functions  approach  and  investigate  other  patterns 
that  occur  in  a  generalization  of  the  Thue-Morse  sequence. 


1  Introduction 

The Thue-Morse (TM) sequence $T = (t_n)_{n \ge 0} = 011010011001011010010\cdots$ is defined as the limit of the iterates $\varphi^n(0)$, where the map $\varphi$ is defined by $\varphi(0) = 01$, $\varphi(1) = 10$. We denote the initial segment of length $2^n$ of the TM sequence by $T_{2^n}$. It can be seen that the TM sequence can also be generated by setting $T_1 = 0$ and

$T_{2^n} = T_{2^{n-1}}\,\overline{T}_{2^{n-1}}$, $n \ge 1$,  (1)

or

$T_{2^n} = T_{2^{n-1}}\,\overline{r(T_{2^{n-1}})}$, for $n$ odd,

$T_{2^n} = T_{2^{n-1}}\,r(T_{2^{n-1}})$, for $n$ even,

where $r(\cdot)$ is the map that reverses the bits of its argument, and $\overline{B}$ is the complement of $B$. Moreover, the TM sequence can also be generated from the binary expansion of the position: if $i = \sum_j b_j 2^j$, then $t_i = \sum_j b_j \pmod 2$. So, $T = (t_n)_{n \ge 0}$ counts the number of 1's (mod 2) in the base-2 representation of $n$. A sequence has the nonoverlap property (also known as the $BBb$ property) if the pattern $BBb$, where $B$ is a block of bits of any length $> 0$ and $b$ is the first bit of $B$, does not appear in that sequence.
The  nonoverlap  property  was  originally  proved  by  Thue  in  his  seminal  papers  from  1906 
and  1912  [11,  12]  for  the  TM  sequence,  and  later  reproved  in  [8]  and  other  places  (see 
[2, 3, 9, 10] for more on the TM sequence). It is said [2] that the Thue-Morse sequence
was  the  start  of  what  we  now  call  combinatorics  on  words. 
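
The four descriptions of the TM sequence given above can be compared numerically. The following is a minimal Python sketch, not part of the original note; the function names are hypothetical, and bit strings are represented as Python strings of "0" and "1".

```python
def complement(block):
    """Bitwise complement of a 0/1 string."""
    return block.translate(str.maketrans("01", "10"))

def tm_morphism(n):
    """T_{2^n} obtained by iterating 0 -> 01, 1 -> 10 on the word "0"."""
    word = "0"
    for _ in range(n):
        word = "".join("01" if c == "0" else "10" for c in word)
    return word

def tm_doubling(n):
    """T_{2^n} from recurrence (1): append the complement of the current prefix."""
    word = "0"
    for _ in range(n):
        word += complement(word)
    return word

def tm_reversal(n):
    """T_{2^n} from the reversal form: complemented reversal at odd steps."""
    word = "0"
    for k in range(1, n + 1):
        rev = word[::-1]
        word += complement(rev) if k % 2 == 1 else rev
    return word

def tm_bitcount(n):
    """T_{2^n} from the parity of the number of 1's in the binary expansion."""
    return "".join(str(bin(i).count("1") % 2) for i in range(2 ** n))

if __name__ == "__main__":
    for n in range(9):
        assert tm_morphism(n) == tm_doubling(n) == tm_reversal(n) == tm_bitcount(n)
    print(tm_bitcount(4))
```

Agreement on a finite range of $n$ is only a sanity check of the definitions, not a proof of their equivalence.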

Let $\mathbb{F}_2^n$ be the vector space of dimension $n$ over the two-element field $\mathbb{F}_2$. Let us denote the addition operator over $\mathbb{F}_2$ by $\oplus$, and the direct product by [...]. The vectors consisting of all 1, respectively, all 0 (of some length) are denoted by $\mathbf{1}$, respectively, $\mathbf{0}$. By abuse of notation, when there is no danger of confusion, we sometimes use $\mathbf{1}, \mathbf{0}$ to denote a binary string consisting of all 1, respectively, all 0. A Boolean function on $n$ variables may be viewed as a mapping from $\mathbb{F}_2^n$ into $\mathbb{F}_2$. We order $\mathbb{F}_2^n$ lexicographically, and denote $v_0 = (0, \ldots, 0, 0)$, $v_1 = (0, \ldots, 0, 1)$, ..., $v_{2^n-1} = (1, \ldots, 1, 1)$. We interpret a Boolean function $f(x_1, \ldots, x_n)$ as the output column of its truth table, i.e., a binary string of length $2^n$, $f = [f(v_0), f(v_1), f(v_2), \ldots, f(v_{2^n-1})]$.

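Let $\epsilon := \epsilon_1\epsilon_2\cdots$ be a sequence of bits $\epsilon_i \in \{0, 1\}$ (possibly infinite). Define a function $r_{\epsilon_i}$ on arbitrary bit-blocks $B$ in the following way:

$r_{\epsilon_i}(B) = B$ if $\epsilon_i = 0$, and $r_{\epsilon_i}(B) = \overline{B}$ if $\epsilon_i = 1$.  (2)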


In [4] we introduced the generalized Thue-Morse sequence $T^{\epsilon} = (t^{\epsilon}_n)_{n \ge 0}$ (we called it the $\epsilon$-TM sequence) by the following algorithm ($T^{\epsilon}_{2^i}$ is the binary string made up of the first $2^i$ bits of $T^{\epsilon}$):

$T^{\epsilon}_1 = t_0 \in \{0, 1\}$

$T^{\epsilon}_{2^i} = T^{\epsilon}_{2^{i-1}}\, r_{\epsilon_i}(T^{\epsilon}_{2^{i-1}})$

The classical Thue-Morse sequence is $T^{\epsilon}$, where $\epsilon = 11\cdots$. Since [4] was published, we learned that Keane [7] also studied this generalization.
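
The algorithm above translates directly into a few lines of Python. This is a minimal sketch under the assumption that $\epsilon$ is given as a finite list of bits and blocks are 0/1 strings; the names `eps_tm` and `complement` are hypothetical.

```python
def complement(block):
    """Bitwise complement of a 0/1 string."""
    return block.translate(str.maketrans("01", "10"))

def eps_tm(eps, t0=0):
    """First 2**len(eps) bits of the eps-TM sequence with initial bit t0.

    At step i the current prefix is followed by itself when eps_i = 0
    and by its complement when eps_i = 1, as in definition (2).
    """
    word = str(t0)
    for e in eps:
        word += complement(word) if e == 1 else word
    return word

if __name__ == "__main__":
    print(eps_tm([1, 1, 1]))   # classical TM prefix of length 8
    print(eps_tm([0, 1, 0]))   # a generalized example
```

With `eps = [1, 1, 1]` the sketch reproduces the classical prefix `01101001`, and with `eps = [0, 1, 0]` it gives `00110011`.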

In  [4]  we  proved 

Theorem 1. The initial segment of length $2^n$, $n > 2$, of the TM sequence is the truth table of the Boolean function

$f(x_1, x_2, \ldots, x_n) = x_1 \oplus x_2 \oplus \cdots \oplus x_n$,

defined on $\mathbb{F}_2^n$ (ordered lexicographically). Moreover, given an initial segment $T^{\epsilon}_{2^n}$ of length $2^n$ of a generalized Thue-Morse sequence, there exists an affine Boolean function $f$ (if $t_0 = 0$, then $f$ is linear) on $n$ variables, such that $T^{\epsilon}_{2^n}$ is the truth table of $f$.
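
Theorem 1 can be checked on small cases. The sketch below is illustrative only; the explicit affine function $t_0 \oplus \epsilon_n x_1 \oplus \cdots \oplus \epsilon_1 x_n$ used in `affine_from_eps` is an assumption derived here from the doubling rule, not a formula quoted from [4], and all function names are hypothetical.

```python
from itertools import product

def complement(block):
    return block.translate(str.maketrans("01", "10"))

def eps_tm(eps, t0=0):
    """First 2**len(eps) bits of the eps-TM sequence (same rule as definition (2))."""
    word = str(t0)
    for e in eps:
        word += complement(word) if e == 1 else word
    return word

def truth_table(f, n):
    """Output column of f over F_2^n in lexicographic order (x_1 most significant)."""
    return "".join(str(f(x)) for x in product((0, 1), repeat=n))

def affine_from_eps(eps, t0=0):
    """Assumed affine function: step i of the doubling toggles by eps_i exactly
    the positions whose bit of weight 2**(i-1) is set, i.e. the variable x_{n-i+1}."""
    n = len(eps)
    return lambda x: (t0 + sum(eps[n - 1 - k] * x[k] for k in range(n))) % 2

if __name__ == "__main__":
    n = 4
    # Classical case: x_1 + ... + x_n (mod 2) gives the TM prefix.
    assert truth_table(lambda x: sum(x) % 2, n) == eps_tm([1] * n)
    # Generalized case, exhaustively for all eps of length n and both t0.
    for eps in product((0, 1), repeat=n):
        for t0 in (0, 1):
            assert truth_table(affine_from_eps(eps, t0), n) == eps_tm(eps, t0)
    print("Theorem 1 verified for n =", n)
```

`itertools.product((0, 1), repeat=n)` enumerates the vectors in the same lexicographic order $v_0, v_1, \ldots, v_{2^n-1}$ used in the text.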

We also define the following set $\mathcal{B}$ of 4-bit strings:

$\mathcal{B} = \{A = 0,0,1,1;\ \overline{A} = 1,1,0,0;\ B = 0,1,0,1;\ \overline{B} = 1,0,1,0;$

$C = 0,1,1,0;\ \overline{C} = 1,0,0,1;\ D = 0,0,0,0;\ \overline{D} = 1,1,1,1\}$.

As  a  consequence  of  Theorem  1  we  have  the  next  corollary. 

Corollary 2. The Thue-Morse sequence can be written as

$T = C\,\overline{C}\,\overline{C}\,C\,\overline{C}\,C\,C\,\overline{C} \ldots$  (5)
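
The block decomposition (5) is easy to confirm on a prefix. In the minimal sketch below (hypothetical names), the letter `C` stands for the block 0110 and the lowercase `c` for its complement 1001.

```python
def tm_prefix(n):
    """T_{2^n} via the parity of the number of 1's in the binary expansion."""
    return "".join(str(bin(i).count("1") % 2) for i in range(2 ** n))

def block_word(prefix):
    """Rewrite a TM prefix (length divisible by 4) over the letters C and c."""
    names = {"0110": "C", "1001": "c"}
    return "".join(names[prefix[i:i + 4]] for i in range(0, len(prefix), 4))

if __name__ == "__main__":
    blocks = block_word(tm_prefix(5))
    print(blocks)
    # Reading C as 0 and c as 1 gives back a TM prefix.
    assert blocks.replace("C", "0").replace("c", "1") == tm_prefix(3)
```

A `KeyError` in `block_word` would signal a 4-bit block other than $C$ or $\overline{C}$; none occurs for TM prefixes.
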
In [4], we showed that if $\epsilon \neq \mathbf{1}$, then the $\epsilon$-TM sequence does not have the nonoverlap property, and we raised the question of investigating the occurrence of other patterns in this generalization. In this short note, we provide yet another proof of Thue's nonoverlap property (arguably the simplest known proof; we use a Boolean functions approach), and find other patterns in the $\epsilon$-TM sequence.

2  The  nonoverlap  property 

Theorem  3.  The  Thue-Morse  sequence  satisfies  the  nonoverlap  property. 

Proof. Assume that the TM sequence $T$ does not satisfy the nonoverlap property, so that there exist blocks $B$ (of length $> 0$) such that $BBb$ occurs in $T$. Take $n$ to be the smallest integer such that $T_{2^n}$ contains such a pattern $BBb$. We assume that $n > 8$, since for $n \le 8$ one can check easily that there is no occurrence of the pattern $BBb$. Write $T_{2^n} = T_{2^{n-1}}\,\overline{T}_{2^{n-1}}$. If there exists $B$ such that $BBb$ occurs in the second half of $T_{2^n}$, then Theorem 1 or (1) implies that $BBb$ must occur in the first half, and so there exists an overlap pattern occurring in $T_{2^{n-1}}$, which contradicts the minimality of $n$. Further, $BBb$ cannot occur in the first half of $T_{2^n}$, since $n$ is minimal. Therefore, $BBb$ must intersect both the first and second halves of $T_{2^n}$, as pictured in Figure 1 ($B = B_1B_2$, where either $B_i$ could be empty), where the pattern $BBb$ is shown in the center of $T_{2^n}$ and is split by the "dividing line" between $T_{2^{n-1}}$ on the left and $\overline{T}_{2^{n-1}}$ on the right. If this central occurrence of $BBb$ does not extend beyond the blocks $T_{2^{n-3}}$ and $\overline{T}_{2^{n-3}}$ on either side of the dividing line, then Theorem 1 or (1) implies that the pattern $BBb$ also occurs in the leftmost block $T_{2^{n-2}}$ inside $T_{2^n}$ (see Figure 1), which contradicts the minimality of $n$.


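[Figure 1: $T_{2^n}$ drawn as eight consecutive blocks $T_{2^{n-3}}\,\overline{T}_{2^{n-3}}\,\overline{T}_{2^{n-3}}\,T_{2^{n-3}}\,\overline{T}_{2^{n-3}}\,T_{2^{n-3}}\,T_{2^{n-3}}\,\overline{T}_{2^{n-3}}$, with the pattern $BBb$ straddling the central dividing line, split either as $BB_1 \mid B_2b$ or as $B_1 \mid B_2Bb$; braces mark the blocks $T_{2^{n-2}}$ and $T_{2^{n-1}}$.]

Figure 1: Assuming that an overlap occurs in $T_{2^n}$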


Now, we consider the case when the pattern $BBb$ extends beyond the blocks $T_{2^{n-3}}$ and $\overline{T}_{2^{n-3}}$. Let the length of $B$ be denoted by $r$, and assume that such an $r$ is minimal for any pattern $BBb$. Let $f$ be the Boolean function generating $T_{2^n}$, and let the initial bit of $B$ be $b = f(v_i)$.

If $r$ is odd, then, since $f(v_i) = f(v_{i+r}) = f(v_{i+2r})$, and since the length of the block $B$ is $r > 2^{n-4} > 2^4$, there must be two blocks $C$ (or $\overline{C}$) at distance exactly $r$ apart in the two blocks $B$. This means that the middle 2-bit blocks $S = 11$ (or $\overline{S} = 00$) must have an odd distance ($r - 2$ bits) between them, which is impossible, since we know that if $t_i = t_{i+1}$, then $i$ must be odd. Thus assuming $r$ odd gives a contradiction.
If $r$ is even, say $r = 2s$, we consider two cases depending on the parity of $i$ (recall that $t_{2m} = t_m$ and $t_{2m+1} = t_m \oplus 1$). If $i = 2j$, then $f(v_{2j}) = f(v_{2j+2s}) = f(v_{2j+4s})$, which implies that $f(v_j) = f(v_{j+s}) = f(v_{j+2s})$. Moreover, replacing $k$ by $2k_0$ in $f(v_{i+k}) = f(v_{i+r+k})$ ($0 < k < r$, $k$ even), we obtain $f(v_{2j+2k_0}) = f(v_{2j+2s+2k_0})$ ($0 < k_0 < s$), and so $f(v_{j+k_0}) = f(v_{j+s+k_0})$ ($0 < k_0 < s$). This implies the existence of a pattern $BBb$ with $B$ of size $s < r$, contradicting the minimality of $r$. If $i = 2j + 1$, then $f(v_{2j+1}) = f(v_{2j+2s+1}) = f(v_{2j+4s+1})$, which implies $f(v_j) \oplus 1 = f(v_{j+s}) \oplus 1 = f(v_{j+2s}) \oplus 1$, and so $f(v_j) = f(v_{j+s}) = f(v_{j+2s})$. Further, as before, replacing $k$ by $2k_0 - 1$ in $f(v_{i+k}) = f(v_{i+r+k})$ ($0 < k < r$, $k$ odd), we obtain $f(v_{2j+2k_0}) = f(v_{2j+2s+2k_0})$ ($0 < k_0 < s$), which is equivalent to $f(v_{j+k_0}) = f(v_{j+s+k_0})$ ($0 < k_0 < s$), which again gives a pattern $BBb$ that contradicts the minimality of $r$. □
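
The statement of Theorem 3, and the parity fact used in the odd case of its proof, can be tested by brute force on a finite prefix. The following minimal sketch uses hypothetical function names and is an empirical check only; it does not replace the proof.

```python
def tm_prefix(n):
    """T_{2^n} via the parity of the number of 1's in the binary expansion."""
    return "".join(str(bin(i).count("1") % 2) for i in range(2 ** n))

def has_overlap(word):
    """True if word contains BBb with |B| = r >= 1 and b the first bit of B.

    BBb starting at position i means word[i+k] == word[i+k+r] for k = 0..r,
    i.e. r + 1 consecutive positions agree with the position r places later.
    """
    n = len(word)
    for r in range(1, (n - 1) // 2 + 1):
        run = 0
        for i in range(n - r):
            run = run + 1 if word[i] == word[i + r] else 0
            if run > r:
                return True
    return False

def equal_neighbours_at_odd_positions(word):
    """Check that t_i == t_{i+1} only happens for odd i."""
    return all(i % 2 == 1 for i in range(len(word) - 1) if word[i] == word[i + 1])

if __name__ == "__main__":
    t = tm_prefix(10)
    print(has_overlap(t))                        # False
    print(equal_neighbours_at_odd_positions(t))  # True
    print(has_overlap("01010"))                  # True: B = 01, b = 0
```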

Next, we consider the generalized Thue-Morse sequence. For simplicity, we label the patterns: $\alpha = BBb$, $\beta = BB\overline{b}$, $\gamma = B\overline{B}b$, $\delta = B\overline{B}\overline{b}$. Using our method from [4] we show the next result.

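Theorem 4. If $\epsilon = \mathbf{1}$, the Thue-Morse sequence avoids $\alpha$ and cannot avoid $\beta, \gamma, \delta$. If $\epsilon = \mathbf{0}$, the $\epsilon$-TM sequence avoids $\beta, \gamma, \delta$ and cannot avoid $\alpha$. If $\epsilon \neq \mathbf{1}, \mathbf{0}$, then the $\epsilon$-TM sequence cannot avoid any of the patterns $\alpha, \beta, \gamma, \delta$.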

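Proof. If $\epsilon = \mathbf{1}$, by Theorem 3, we know that $\alpha$ is avoided. From (5), we see that $\overline{C}\,\overline{C}\,C$, $C\,\overline{C}\,C$, $C\,\overline{C}\,\overline{C}$ occur in $T$, and so $\beta, \gamma, \delta$ are not avoided in $T$. If $\epsilon = \mathbf{0}$, then the $\epsilon$-TM sequence is simply $0000\ldots$, and so the second claim is true.

If $\epsilon \neq \mathbf{0}, \mathbf{1}$, we showed in [4] that the $\epsilon$-TM sequence does not have the nonoverlap property, and so $\alpha = BBb$ must occur in $T^{\epsilon}$. Further, since $\epsilon \neq \mathbf{1}, \mathbf{0}$, then $\epsilon$ must contain at least one of the patterns 010, 0110, 0111. We shall consider these cases separately. For ease of writing, we let $B_0 = \{t_0\}$ and $B_i := T^{\epsilon}_{2^i}$, $i \ge 1$.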

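Case (i) Let $\epsilon_i\epsilon_{i+1}\epsilon_{i+2} = 010$, for some $i \ge 1$. Then

$B_{i+2} = B_{i+1}\, r_{\epsilon_{i+2}}(B_{i+1}) = B_{i+1}B_{i+1}$

$= B_i\, r_{\epsilon_{i+1}}(B_i)\, B_i\, r_{\epsilon_{i+1}}(B_i) = B_i\,\overline{B}_i\,B_i\,\overline{B}_i$

$= B_{i-1}\, r_{\epsilon_i}(B_{i-1})\ \overline{B_{i-1}\, r_{\epsilon_i}(B_{i-1})}\ B_{i-1}\, r_{\epsilon_i}(B_{i-1})\ \overline{B_{i-1}\, r_{\epsilon_i}(B_{i-1})}$

$= B_{i-1}B_{i-1}\,\overline{B}_{i-1}\overline{B}_{i-1}\,B_{i-1}B_{i-1}\,\overline{B}_{i-1}\overline{B}_{i-1}$,

which contains $B_i\,\overline{B}_i\,B_i$, $B_{i-1}B_{i-1}\overline{B}_{i-1}$, $B_{i-1}\overline{B}_{i-1}\overline{B}_{i-1}$, and so $\beta, \gamma, \delta$ are not avoided in $T^{\epsilon}$.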

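Case (ii) Let $\epsilon_i\epsilon_{i+1}\epsilon_{i+2}\epsilon_{i+3} = 0110$, for some $i \ge 1$. Then

$B_{i+3} = B_{i+2}B_{i+2} = B_{i+1}\overline{B}_{i+1}\,B_{i+1}\overline{B}_{i+1}$

$= B_i\,\overline{B}_i\,\overline{B}_i\,B_i\,B_i\,\overline{B}_i\,\overline{B}_i\,B_i$,

which contains $B_iB_i\overline{B}_i$, $\overline{B}_iB_iB_i$, $B_i\overline{B}_i\overline{B}_i$ (and also $B_{i+1}\overline{B}_{i+1}B_{i+1}$), and so $\beta, \gamma, \delta$ are not avoided in $T^{\epsilon}$.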

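Case (iii) Let $\epsilon_i\epsilon_{i+1}\epsilon_{i+2}\epsilon_{i+3} = 0111$, for some $i \ge 1$. Then

$B_{i+3} = B_{i+2}\overline{B}_{i+2} = B_{i+1}\overline{B}_{i+1}\,\overline{B}_{i+1}B_{i+1}$

$= B_i\,\overline{B}_i\,\overline{B}_i\,B_i\,\overline{B}_i\,B_i\,B_i\,\overline{B}_i$,

which contains $\overline{B}_i\overline{B}_iB_i$, $B_i\overline{B}_iB_i$, $B_i\overline{B}_i\overline{B}_i$, which implies that $\beta, \gamma, \delta$ are not avoided in $T^{\epsilon}$, and the theorem is proved. □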
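
The pattern occurrences discussed in Theorem 4 can be explored on finite prefixes. The sketch below is an illustration with hypothetical names; it assumes the labelling $\alpha = BBb$, $\beta = BB\overline{b}$, $\gamma = B\overline{B}b$, $\delta = B\overline{B}\overline{b}$, and a search on a prefix can only confirm that a pattern occurs, never that it is avoided in the infinite sequence.

```python
def complement(block):
    return block.translate(str.maketrans("01", "10"))

def eps_tm(eps, t0=0):
    """First 2**len(eps) bits of the eps-TM sequence."""
    word = str(t0)
    for e in eps:
        word += complement(word) if e == 1 else word
    return word

def find_pattern(word, flip_block, flip_bit):
    """Return (start, |B|) of the first occurrence of B B' b', or None.

    B' is the complement of B if flip_block is set, otherwise B itself;
    b' is the complement of the first bit of B if flip_bit is set.
    """
    n = len(word)
    for r in range(1, (n - 1) // 2 + 1):
        for i in range(n - 2 * r):
            first = word[i:i + r]
            second = complement(first) if flip_block else first
            last = complement(first[0]) if flip_bit else first[0]
            if word[i + r:i + 2 * r] == second and word[i + 2 * r] == last:
                return i, r
    return None

# Assumed labelling of the four patterns: (complement second block?, complement last bit?)
PATTERNS = {"alpha": (False, False), "beta": (False, True),
            "gamma": (True, False), "delta": (True, True)}

def avoided(word):
    """Names of the patterns that do not occur in the given finite word."""
    return {name for name, flags in PATTERNS.items()
            if find_pattern(word, *flags) is None}

if __name__ == "__main__":
    print(avoided(eps_tm([1] * 7)))             # only alpha is missing
    print(avoided(eps_tm([0] * 7)))             # beta, gamma, delta are missing
    print(avoided(eps_tm([0, 1, 0, 0, 1, 0])))  # nothing is missing
```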